Add `get_model_lineage_dev` CLI tool #420

VDFaller · 2025-10-29T18:05:59Z

Summary

Added a `get_model_lineage_dev` CLI tool that parses the local manifest to get a model's lineage.

What Changed

The tool runs `dbt parse`, then reads the resulting manifest.

Why

So that lineage is usable by non-cloud customers.

Checklist

I have performed a self-review of my code
I have made corresponding changes to the documentation (in https://github.com/dbt-labs/docs.getdbt.com) if required -- Mention it here
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Additional Notes

There are some differences in the outputs. I can try to line them up if needed.

Prompt Call & Response

"Get me the lineage of this model"
- Bad - doesn't use mcp
"use the dbt mcp server to tell me the full lineage of this model"
- Good - calls both directions, recursively
"use the dbt mcp server to tell me the lineage of this model"
- Bad - ran jq on the manifest
- 2nd run - Fine - calls both directions recursively
"use the dbt mcp server to get me the children of this model, including tests"
- Good - Gets non-recursive children (used the name not the unique_id) and had the tests included.


b-per · 2025-10-31T12:16:57Z

Thanks Vince!

I am wondering whether we should create new tools instead of making get_model_parents/get_model_children able to query either the Metadata API or the local artefacts.

Here are my Pros/Cons of having new tools for get_model_parents/children dedicated to the CLI/manifest

Pros
- it would be easy to activate those alongside the rest of the CLI tools and deactivate the dbt platform ones if needed (just activating the CLI toolset)
- people could query both the get_model_parents from the metadata API and the new local tool in a single LLM session/context to compare the changes that they are introducing
- it feels simpler to understand what tool does what and easier to know which ones someone might want to activate/deactivate
Cons
- we already have many tools and it would add more (but realistically most people shouldn't activate all tools and tweak those to their use case)
- we'd need to find good names and descriptions to explain to the LLM the difference between the children from metadata and the children from manifest if we want to avoid confusing the LLM

So I am in the camp of adding this functionality, but in new tools, and I'd be keen to hear other people's opinions about it.

As a side note, would you be able to set up signed commits for this repo? We can bypass this check at the PR level as repo admins, but this repo expects all commits to be signed now.

VDFaller · 2025-11-05T19:06:36Z

@b-per Crap, that's how I originally had it (shouldn't have squashed 😢 )

I didn't like it because when I was trying it out, exactly as you pointed out, the LLM seemed to pick which tool to use arbitrarily. I could run very similar queries and get two different results. We could give better names/descriptions so the LLM wouldn't get confused, but I don't think the user would necessarily know whether they were getting the answer they expected.

me
- Get me the model parents for jaffle_shop.orders.
MCP
- Okay, we've got some options there. Do you want production parents or local parents, recursive or not?
me
- What?

I think it would just run the tool and the user would see it asking to run get_model_lineage(...) and go "that seems right", without knowing the nuance.

If it were to be two separate tools, do you think it should be a CLI tool or a Discovery tool? My entire thought process was "Discovery is NOT just platform", especially after listening to Jason's talk at Coalesce where he talked about them abstractly. This very much relates to #418 in my head.

Rebased with gpgSign on, no idea why it was set to false for this repo.

Also on this

> we already have many tools and it would add more (but realistically most people shouldn't activate all tools and tweak those to their use case)

Do y'all have data to show that's the case? I'd expect people to just give it everything.

DevonFulcher · 2025-11-05T20:37:16Z

Hey @VDFaller and @b-per, I'm sorry for the back-and-forth on this. I told Vince that I appreciated sticking with the existing get_model_parents/get_model_children tools and routing between the local and remote versions depending on the user's config. Let's get aligned on this. I think the Pros you listed are valid, Benoit, but they may not be the features most worth optimizing for.

I like the router approach because the agent typically doesn't care whether the information is coming from a local or remote source; the user cares more about that. Also, with the latest config changes, it is quite easy to point the agent to local or remote. If DBT_HOST is present, use GQL; otherwise, use the local version. Turning on/off more tool options depending on local or remote usage is more flexible, but it is also more complex, and I don't think most users want to use both local and remote at the same time.

Furthermore, this router approach can be applied to the Semantic Layer tools in the future. It is a source of frustration for some users that these tools don't work locally.
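The routing rule described above can be sketched in a few lines. This is a minimal illustration and not the dbt-mcp implementation: `pick_lineage_fetcher` and both fetcher stand-ins are hypothetical names, and the only detail taken from the discussion is that a set `DBT_HOST` selects the remote (GQL) path.

```python
import os
from typing import Callable, Mapping

# A fetcher takes a model unique_id and returns the ids of its parents.
LineageFetcher = Callable[[str], list[str]]


def pick_lineage_fetcher(
    env: Mapping[str, str],
    remote: LineageFetcher,
    local: LineageFetcher,
) -> LineageFetcher:
    """Return the remote fetcher when DBT_HOST is set, else the local one."""
    # Assumption: an empty or whitespace-only DBT_HOST counts as "not configured".
    if env.get("DBT_HOST", "").strip():
        return remote
    return local


# Hypothetical stand-ins for the Discovery API call and the manifest reader.
def parents_from_discovery_api(model_id: str) -> list[str]:
    return [f"remote parent of {model_id}"]


def parents_from_local_manifest(model_id: str) -> list[str]:
    return [f"local parent of {model_id}"]


fetch_parents = pick_lineage_fetcher(
    os.environ, parents_from_discovery_api, parents_from_local_manifest
)
```

With this shape the agent only ever sees one tool, and the choice of backend is made once from configuration rather than per prompt.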


Copilot

Pull request overview

This PR adds a new get_model_lineage_dev CLI tool that enables non-cloud customers to retrieve model lineage information by parsing the local development manifest, supporting upstream, downstream, or bidirectional lineage queries with optional recursive traversal and test filtering.

Key Changes:

- Implemented ModelLineage data model with support for recursive parent/child traversal and cycle detection
- Added get_model_lineage_dev tool to the dbt CLI toolset with configurable direction, recursion, and exclusion options
- Updated existing discovery prompts to clarify distinction between production and development lineage tools
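As a rough illustration of the traversal this overview refers to, the sketch below collects lineage recursively with cycle protection. It assumes only that manifest.json exposes top-level `parent_map` and `child_map` dictionaries keyed by unique_id. `collect_lineage` is a hypothetical helper, not the PR's `ModelLineage.from_manifest`, and it returns a flat list rather than the nested models the PR defines.

```python
from collections import deque


def collect_lineage(
    manifest: dict,
    model_id: str,
    direction: str = "parents",
    recursive: bool = True,
    exclude_prefixes: tuple[str, ...] = ("test.", "unit_test."),
) -> list[str]:
    """Walk parent_map or child_map from model_id, visiting each node once."""
    edges = manifest["parent_map" if direction == "parents" else "child_map"]
    seen = {model_id}  # the visited set doubles as cycle protection
    queue = deque([model_id])
    found: list[str] = []
    while queue:
        current = queue.popleft()
        for neighbor in edges.get(current, []):
            # Skip nodes already visited and nodes with an excluded prefix.
            if neighbor in seen or neighbor.startswith(exclude_prefixes):
                continue
            seen.add(neighbor)
            found.append(neighbor)
            if recursive:
                queue.append(neighbor)
    return found
```

Passing `exclude_prefixes=()` keeps tests in the result, which corresponds to the "children, including tests" prompt in the PR description.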

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| src/dbt_mcp/dbt_cli/models/lineage_types.py | New module implementing ModelLineage, Ancestor, and Descendant models with manifest parsing logic |
| src/dbt_mcp/dbt_cli/tools.py | Added get_model_lineage_dev function and _get_manifest helper to load and parse manifest.json |
| tests/unit/dbt_cli/test_model_lineage.py | Comprehensive test coverage for lineage parsing with various scenarios |
| src/dbt_mcp/tools/tool_names.py | Registered GET_MODEL_LINEAGE_DEV tool name |
| src/dbt_mcp/tools/toolsets.py | Added new tool to DBT_CLI toolset |
| src/dbt_mcp/tools/policy.py | Defined tool policy as METADATA behavior |
| src/dbt_mcp/prompts/dbt_cli/get_model_lineage_dev.md | Documentation for the new tool with usage examples |
| src/dbt_mcp/prompts/discovery/get_model_parents.md | Clarified this tool is for production manifest |
| src/dbt_mcp/prompts/discovery/get_model_children.md | Clarified this tool is for production manifest |
| README.md | Added get_model_lineage_dev to CLI tools list |
| .changes/unreleased/Enhancement or New Feature-20251203-203944.yaml | Changelog entry for the feature |


DevonFulcher · 2025-12-04T23:52:36Z

src/dbt_mcp/dbt_cli/tools.py

+    def get_model_lineage_dev(
+        model_id: str,
+        direction: Literal["parents", "children", "both"] = "both",
+        exclude_prefixes: tuple[str, ...] = ("test.", "unit_test."),
+        *,
+        recursive: bool,
+    ) -> dict[str, Any]:


Non-blocking: I think this should have essentially the same function signature as the other lineage tool: https://github.com/dbt-labs/dbt-mcp/pull/461/files#diff-6d91f0721d8dcde8199de504338811a7063757ec13f32eca508bfbc8b663a54bR390-R396

VDFaller · 2025-12-05

I'll make a separate PR once they're both in to align them. Cool?

DevonFulcher · 2025-12-04T23:55:47Z

src/dbt_mcp/dbt_cli/tools.py

+        import json
+
+        _run_dbt_command(["parse"])  # Ensure manifest is generated
+        cwd_path = config.project_dir if os.path.isabs(config.project_dir) else None


Non-blocking: when the server starts up, we should make all paths absolute if they aren't already.
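One way to apply that suggestion is a small normalization step run once when configuration is loaded. The sketch below is only an illustration: `to_absolute` is a hypothetical helper, and resolving relative paths against the current working directory is an assumption rather than something the PR specifies.

```python
from pathlib import Path
from typing import Optional


def to_absolute(path_str: str, base: Optional[Path] = None) -> Path:
    """Resolve a possibly relative path against base (default: the cwd)."""
    path = Path(path_str).expanduser()
    if not path.is_absolute():
        # Assumption: relative paths are relative to where the server started.
        path = (base or Path.cwd()) / path
    # resolve() also collapses ".." segments; it does not require the path to exist.
    return path.resolve()
```

After this step, code such as the `cwd_path` line above would no longer need the `os.path.isabs` branch, because every configured path would already be absolute.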


DevonFulcher · 2025-12-05T00:16:03Z

src/dbt_mcp/dbt_cli/models/lineage_types.py

+                ]
+            else:
+                # Build nested descendant trees. Prevent cycles using path tracking.
+                def _build_descendant(node_id: str, path: set[str]) -> Descendant:


It seems like we could consolidate the two versions of this function into one. The differences are minimal.

VDFaller · 2025-12-05

Had to do some funky casting to make mypy happy. Let me know if you feel like it's better.
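For context, the consolidation being discussed could look roughly like the sketch below, where a single builder serves both directions and only the edge map differs. This is a simplified illustration under stated assumptions: `LineageNode` and `build_tree` are hypothetical names, and the PR's actual `Ancestor` and `Descendant` types (and the casting mentioned above) are not reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class LineageNode:
    unique_id: str
    relatives: list["LineageNode"] = field(default_factory=list)


def build_tree(
    edges: dict[str, list[str]],
    node_id: str,
    path: frozenset[str] = frozenset(),
) -> LineageNode:
    """Build a nested tree from an edge map, tracking the current path to stop cycles."""
    path = path | {node_id}  # ids on the branch from the root down to this node
    node = LineageNode(node_id)
    for neighbor in edges.get(node_id, []):
        if neighbor in path:
            continue  # following this edge would loop back onto the current branch
        node.relatives.append(build_tree(edges, neighbor, path))
    return node
```

Calling `build_tree(manifest["parent_map"], model_id)` would give the ancestor tree and `build_tree(manifest["child_map"], model_id)` the descendant tree. Because the path is tracked per branch rather than globally, a node reachable through two routes appears under both.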


VDFaller requested review from a team, b-per and jasnonaz as code owners October 29, 2025 18:06


VDFaller marked this pull request as draft October 29, 2025 20:39

VDFaller force-pushed the model-lineage-cli branch 2 times, most recently from a48b7a2 to 76ac932 Compare October 30, 2025 17:27

VDFaller force-pushed the model-lineage-cli branch from 76ac932 to c70b52e Compare November 5, 2025 19:11

VDFaller added 3 commits November 6, 2025 18:51

Add functionality to get model lineage via the manifest in CLI.

637e5ca

Add a fallback path for Discovery tools to use CLI functionality. Add ModelLineage type with main constructor `from_manifest`. The CLI path will not work until auto-disable is functioning correctly.

restore to merge-base 7fa18eb

bc06c32

separate cli and discovery tool to get_model_lineage_dev

adf92cc

VDFaller force-pushed the model-lineage-cli branch from c70b52e to adf92cc Compare November 7, 2025 02:52

VDFaller added 3 commits November 6, 2025 19:09

reorder args

6d5db43

add endpoint args to the tool.

Add specificity to tool descriptions to delineate

adeba32

sources have identifiers, not models.

c041c1a

VDFaller force-pushed the model-lineage-cli branch from 08b5157 to c041c1a Compare November 7, 2025 04:40

VDFaller marked this pull request as ready for review November 7, 2025 16:35

VDFaller changed the title ~~CLI fallback for get_model_parents/get_model_children discovery tools~~ Add get_model_lineage_dev CLI tool Nov 7, 2025


b-per and others added 4 commits November 12, 2025 12:19

Merge branch 'main' into model-lineage-cli

d818a18

Merge branch 'main' into model-lineage-cli

74f2309

Merge branch 'main' into model-lineage-cli

e832496

add changie

9bcb837

VDFaller requested a review from a team as a code owner December 3, 2025 20:40

Merge remote-tracking branch 'origin/main' into model-lineage-cli

16591c5

DevonFulcher requested a review from Copilot December 4, 2025 23:43

Copilot AI reviewed Dec 4, 2025


Apply suggestions from code review

d49a815

Co-authored-by: Copilot <[email protected]>

DevonFulcher reviewed Dec 4, 2025

DevonFulcher reviewed Dec 5, 2025

create Manifest structures for better type validation

5a48ff7

refactor ModelLineage.from_manifest for readability

VDFaller mentioned this pull request Dec 5, 2025

Align get_model_lineage functions. #477

Open

get rid of NodeT for better readability

3de7eee
